The Information Commons: Introduction to HTML

2.3 Naming Scheme for HTML Documents

When your HTML browser (Netscape Navigator, Mosaic, Internet Explorer, TkWWW, lynx, etc.) retrieves a file, it must know what type of data it has received in order to know what to do with it. Hypertext (that is, HTTP) servers explicitly tell the browser the type of the data being sent. In other cases, such as when the browser is using FTP or local file access, the browser guesses the data type from the filename extension, that is, the part after the last dot in the filename. For example, HTML files are identified by names such as name.html, where the .html extension indicates an HTML document.

Four-letter extensions such as .html and .jpeg are common. This is not a problem on UNIX computers or Macintoshes, since these machines place no restriction on the length of the extension. DOS and Windows 3.1 machines, unfortunately, are restricted to a three-letter extension, so the extension is generally truncated to three letters (for example, .html becomes .htm).

Here are some of the standard extensions, and their meanings:

.html (also .htm)
HTML document, containing text and HTML mark-up instructions.
.txt
A plain text file. The browser presents the file as a block of text and does not process it for mark-up instructions. Browsers generally treat unknown types of data as a text file.
.gif
A GIF format image file.
.xbm
An X-Bitmap (black and white) image file.
.xpm
An X-Pixmap (colour) image file.
.jpeg (also .jpg)
A JPEG-encoded image file.
.mpeg (also .mpg or .mpe)
An MPEG-encoded video file.
.qt
A (Macintosh) QuickTime-format video file.
.avi
A (Microsoft) AVI-format video file.
.au
A Sun/NeXT-format audio (sound) file, typically mu-law encoded.
.Z
A compressed file - compressed using adaptive Lempel-Ziv coding. The compression/decompression programs (compress and uncompress) are commonly found on UNIX computers.
.gz
A compressed file - compressed using the GNU gzip program. This program is common on UNIX computers and is available on PCs and Macintoshes.

MIME Types and File Data Formats

The World Wide Web uses MIME (Multipurpose Internet Mail Extensions) types to define the type of a particular piece of transferred information. A browser in turn determines, from the MIME type, how the data should be treated. Each browser has a configuration (a menu or a file) that maps each type of data to a particular function. A browser can handle many types of data itself (e.g. HTML documents, GIF images), while other types are passed to auxiliary programs, such as image viewers, movie or sound players, and so on.
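The mapping from MIME type to function can be pictured as a simple lookup table. The following Python sketch is only an illustration: the table contents, the handler names, and the choice of which types count as "internal" are assumptions made for the example, not the configuration of any real browser.

```python
# Hypothetical browser configuration: MIME type -> (kind of handling, handler name).
# "internal" means the browser displays the data itself; "helper" means the data
# is passed to an auxiliary program. All handler names are invented for illustration.
HANDLERS = {
    "text/html": ("internal", "render_html"),
    "text/plain": ("internal", "show_text"),
    "image/gif": ("internal", "show_image"),
    "image/jpeg": ("helper", "image-viewer"),
    "video/mpeg": ("helper", "movie-player"),
    "audio/basic": ("helper", "sound-player"),
}


def choose_handler(mime_type):
    """Return (kind, name) for a MIME type, or ("unknown", None) if unconfigured."""
    # MIME type names are case-insensitive, so normalise before the lookup.
    return HANDLERS.get(mime_type.strip().lower(), ("unknown", None))


print(choose_handler("text/html"))
print(choose_handler("video/mpeg"))
```

A type that is missing from the table comes back as "unknown"; what a browser does in that case (display as text, offer to save, and so on) depends on the browser.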

HTTP servers send a MIME content-type header ahead of every file they deliver to a browser. This header explicitly tells the browser what type of data is being sent. Thus a server must have a way of determining the type of data it is sending. Usually the server has a configuration file that relates filename extensions to the appropriate MIME types. For example, the MIME type for HTML documents is text/html. Thus, if a browser requests that a server send the file blobs.html, the server first looks up the MIME type corresponding to the .html extension. The server then sends a message to the browser saying that data of content-type text/html is being sent, after which the server sends the actual data.
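The server-side steps for blobs.html can be sketched in Python as follows. This is a minimal illustration, not how any particular server is implemented: the SERVER_TYPES table, the build_response function, the text/plain fallback, and the sample file contents are all assumptions made for the example.

```python
import os

# Assumed server configuration: filename extension -> MIME type.
SERVER_TYPES = {
    ".html": "text/html",
    ".htm": "text/html",
    ".txt": "text/plain",
    ".gif": "image/gif",
    ".jpeg": "image/jpeg",
    ".jpg": "image/jpeg",
}


def build_response(filename, body):
    """Return the bytes a very simple server might send for one file."""
    extension = os.path.splitext(filename)[1].lower()
    # Falling back to text/plain for unknown extensions is an assumption of this sketch.
    mime_type = SERVER_TYPES.get(extension, "text/plain")
    header = (
        "HTTP/1.0 200 OK\r\n"
        f"Content-Type: {mime_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    # The header is sent first; the file data follows the blank line.
    return header.encode("ascii") + body


response = build_response("blobs.html", b"<html><body>Blobs</body></html>")
print(response.decode("ascii"))
```

The browser reads the Content-Type line before it sees any of the data, so it never needs to inspect the filename in this case.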

Other servers, such as FTP servers, do not send this MIME type information. In this case, the browser "guesses" the MIME type based on the filename extension. Thus each browser must be configured with a list that relates typical extensions to the "most likely" type of data. This is also how a browser determines the type of files accessed on the local computer.
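A browser-side guess of this kind might look like the Python sketch below. The GUESSES table and the guess_mime_type function are hypothetical; the MIME type names are the ones conventionally associated with these extensions, and the fallback to text/plain reflects the tendency, noted earlier, for browsers to treat unknown data as text.

```python
import posixpath
from urllib.parse import urlparse

# Assumed browser-side table: extension -> "most likely" MIME type.
GUESSES = {
    ".html": "text/html", ".htm": "text/html", ".txt": "text/plain",
    ".gif": "image/gif", ".xbm": "image/x-xbitmap", ".xpm": "image/x-xpixmap",
    ".jpeg": "image/jpeg", ".jpg": "image/jpeg",
    ".mpeg": "video/mpeg", ".mpg": "video/mpeg", ".mpe": "video/mpeg",
    ".qt": "video/quicktime", ".avi": "video/x-msvideo", ".au": "audio/basic",
}


def guess_mime_type(url):
    """Guess a MIME type from the extension of the path part of a URL."""
    path = urlparse(url).path
    # Lower-casing the extension is a simplification made for this sketch.
    extension = posixpath.splitext(path)[1].lower()
    # Unknown or missing extensions are treated as plain text.
    return GUESSES.get(extension, "text/plain")


print(guess_mime_type("ftp://ftp.example.org/pub/logo.GIF"))
print(guess_mime_type("file:///home/user/notes"))
```

Because this is only a guess, a misnamed file (for example, a GIF image saved as picture.txt) would be handled incorrectly, which is why the explicit header sent by HTTP servers is more reliable.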

For more information on MIME types, see the Internet RFC defining the MIME format, namely RFC 1341 (since superseded by RFC 1521 and later by RFCs 2045-2049).



© Ian Graham 1994-1996 Page Last Updated: 15 March 1996